Nonparametric data envelopment analysis (DEA) estimators have been
widely applied in the analysis of productive efficiency. Typically they
are defined in terms of convex-hulls of the observed combinations of
inputs × outputs in a sample of enterprises. The shape of the
convex-hull relies on a hypothesis on the shape of the technology,
defined as the boundary of the set of technically attainable points in
the inputs × outputs space. So far, only the statistical properties of
the smallest convex polyhedron enveloping the data points have been
considered, which corresponds to a situation where the technology
presents variable returns-to-scale (VRS). This paper analyzes the case
where the most common constant returns-to-scale (CRS) hypothesis is
assumed. Here the DEA is defined as the smallest conical-hull with
vertex at the origin enveloping the cloud of observed points. In this
paper we determine the asymptotic properties of this estimator, showing
that the rate of convergence is better than for the VRS estimator. We
also derive its asymptotic sampling distribution with a practical way to
simulate it. This allows us to define a bias-corrected estimator and to
build confidence intervals for the frontier. We compare in a simulated
example the bias-corrected estimator with the original conical-hull
estimator and show its superiority in terms of median squared error.
1. Introduction. Consider a convex set $\Psi$ in $\mathbb{R}^{p+1}_+$ which
takes the form

$\Psi = \{(x, y) \in \mathbb{R}^{p+1}_+ : 0 \le y \le g(x)\}$,

where $g$ is a nonnegative convex function defined on $\mathbb{R}^p_+$ such
that $g(ax) = a g(x)$ for all $a > 0$. Suppose that we have a random sample
$(X_i, Y_i)$ drawn from a distribution which is supported on $\Psi$. In this
paper, we are interested in estimating the "boundary" function $g$ from the
random sample. In particular, we study the asymptotic distribution of the
estimator

(1) $\hat g(x) = \max\{y \ge 0 : (x, y) \in \hat\Psi\}$,

where $\hat\Psi$ is the convex-hull of the rays
$R_i = \{(\gamma X_i, \gamma Y_i) : \gamma \ge 0\}$ for all sample points
$(X_i, Y_i)$.

The problem arises in an area of econometrics where one is interested in
evaluating the performance of an enterprise in terms of technical
efficiency. In this context, $X_i$ is the observed input vector of the
$i$th enterprise, $Y_i$ is its observed productivity and $\Psi$ is the
production set of technically feasible pairs of input and output. The
property that $g(ax) = a g(x)$ for all $a > 0$, or, equivalently,
$a\Psi = \Psi$ for all $a > 0$, is called "constant returns-to-scale"
(CRS), and the commonly used estimator of $\Psi$ in this case is the
CRS-version of the data envelopment analysis (DEA) estimator defined by

$\hat\Psi_0 = \Big\{(x, y) \in \mathbb{R}^{p+1}_+ : x \ge \sum_{i=1}^n \gamma_i X_i,\ y \le \sum_{i=1}^n \gamma_i Y_i \text{ for some } \gamma_i \ge 0,\ i = 1, \ldots, n \Big\}$.

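In fact, $\hat\Psi_0$ given above is nothing else than the smallest convex
set containing all the rays $R_i$ and the hyperplane
$\{(x, 0) : x \in \mathbb{R}^p_+\}$. To see this, suppose that $(x, y)$
belongs to $\hat\Psi_0$. Then, there exist $\gamma_i \ge 0$ such that
$x \ge \sum_{i=1}^n \gamma_i X_i$ and $y \le \sum_{i=1}^n \gamma_i Y_i$. For
these constants $\gamma_i$, define

$\gamma_i^* = \gamma_i \, y \Big/ \sum_{j=1}^n \gamma_j Y_j$

for $1 \le i \le n$. Then $\sum_{i=1}^n \gamma_i^* Y_i = y$. Since
$x \ge \sum_{i=1}^n \gamma_i X_i \ge \sum_{i=1}^n \gamma_i^* X_i$, we have
$x^* \equiv x - \sum_{i=1}^n \gamma_i^* X_i \ge 0$. This shows
$(x, y) = \sum_{i=1}^n \gamma_i^* (X_i, Y_i) + (x^*, 0)$.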

The estimator $\hat g$ defined in (1) and the one based on $\hat\Psi_0$ are
identical with probability tending to one if the density of $(X_i, Y_i)$ is
bounded away from zero in a neighborhood of the boundary point $(x, g(x))$.

The problem that we describe in the first paragraph can be generalized
to the case of vector-valued $y \in \mathbb{R}^q$. This is particularly
important in the specific problem that we mention in the above paragraph
where productivity is typically measured in several variables. For this, we
consider a conical-hull of a convex set $A$ in $\mathbb{R}^{p+q}_+$ which is
given by

$\Psi = \{(x, y) \in \mathbb{R}^{p+q}_+ : \text{there exists a constant } a > 0 \text{ such that } (ax, ay) \in A\} \cup \{0\}$.
The set $\Psi$ is convex and satisfies the CRS condition

(2) $a\Psi = \Psi$ for all $a > 0$.

We are interested in estimating the "directional edge" of $\Psi$ in the
$y$-space, defined by

$\lambda(x, y) = \sup\{\lambda > 0 : (x, \lambda y) \in \Psi\}$,

using a random sample from a density supported on $\Psi$. In the case where
$q = 1$, the directional edge is linked directly to the boundary function
$g$ by the identity $g(x) = y\lambda(x, y)$. We consider the estimator

(3) $\hat\lambda(x, y) = \sup\{\lambda > 0 : (x, \lambda y) \in \hat\Psi\}$,

where $\hat\Psi$ is the convex-hull of the rays
$R_i = \{(\gamma X_i, \gamma Y_i) : \gamma \ge 0\}$ for all sample points
$(X_i, Y_i)$.
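
As a concrete illustration of (3), consider the simplest setting $p = q = 1$. There the convex-hull of the rays is the cone lying under the steepest ray, so the estimator has a closed form. The sketch below is only an illustration under that assumption; the function name `lambda_hat_1d` and the sample values are hypothetical and do not come from the paper.

```python
def lambda_hat_1d(x, y, sample):
    """CRS estimate of the directional edge at (x, y) when p = q = 1.

    With scalar input and output, the conical hull of the rays through
    the sample points is the cone under the steepest ray, so
    lambda_hat(x, y) = (x / y) * max_i (Y_i / X_i).
    """
    best_slope = max(Y / X for X, Y in sample if X > 0)
    return x * best_slope / y


# Hypothetical sample of (X_i, Y_i) pairs, for illustration only.
sample = [(1.0, 0.5), (2.0, 1.6), (4.0, 2.0)]
print(lambda_hat_1d(2.0, 1.0, sample))
```

A value above one indicates that, according to the estimated frontier, the output at $(x, y)$ could be scaled up by that factor while keeping the input fixed.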

To date, nonparametric data envelopment analysis (DEA) estimators have 
been discussed or applied in more than 1800 articles published in more than 
400 journals [see Gattoufi, Oral and Reisman (2004) for a comprehensive
bibliography]. DEA estimators are used to estimate various types of productive
efficiency of firms in a wide variety of industries as well as governmental 
agencies, national economies and other decision-making units. The estima- 
tors employ linear programming methods, similar to the one appearing in 
(3), along the lines of Charnes, Cooper and Rhodes (1978) who popularized 
the basic ideas of Farrell (1957). 

Typically these DEA estimators are indeed defined in terms of
convex-hulls of the combinations of inputs × outputs $(X_i, Y_i)$ in a
sample of firms. The shape of the convex-hull relies on a hypothesis on the
shape of the technology defined as the boundary of the set $\Psi$ of
technically attainable points in the inputs × outputs space. So far, only
the statistical properties of the smallest convex polyhedron enveloping the
data points have been considered, which corresponds to a situation where the
technology presents
variable returns-to-scale (VRS). Convergence results for DEA-VRS have
been derived by Korostelev, Simar and Tsybakov (1995) in the case of uni- 
variate output and by Kneip, Park and Simar (1998) in the multivariate 
case. Asymptotic distribution of the DEA-VRS estimators was obtained in 
the bivariate case (p = q = 1) by Gijbels et al. (1999), for univariate output 
by Jeong and Park (2006) and for the full multivariate case by Jeong (2004) 
and Kneip, Simar and Wilson (2008). 

VRS is a flexible assumption, but in many situations the economist
assumes that the technology presents CRS: the first version of the DEA
estimator derived by Farrell (1957) was for this situation. Here the DEA
estimator $\hat\Psi$ is defined, as above, after (3), as the smallest
conical-hull with a vertex at the origin enveloping the cloud of observed
points. The properties of this estimator have not been investigated, yet it
was conjectured that one would gain some efficiency in the estimation by
imposing the appropriate CRS structure on the estimator.

In this paper we determine the asymptotic properties of the DEA-CRS 
estimator defined in (3), showing that the rate of convergence is better than 
that of the VRS estimator. We derive also its asymptotic sampling dis- 
tribution with a practical way to simulate it. This allows us to define a 
bias-corrected estimator and to build confidence intervals for the frontier. 
We compare, in a simulated example, the bias-corrected estimator with the 
original DEA-CRS estimator and show its superiority in terms of median 
squared error. 

2. Rate of convergence. In this section we give the first theoretical
result, the convergence rate of the estimator $\hat\lambda$, as defined in
(3), in the general case of $p, q \ge 1$. Before presenting the result, we
first give two lemmas which will be used in the proof of the first theorem.

Lemma 1. For any $\alpha, \beta > 0$, it holds that
$\lambda(\alpha x, \beta y) = (\alpha/\beta)\lambda(x, y)$ whenever
$(\alpha x, \beta y) \in \Psi$ and $(x, y) \in \Psi$. The same identity
holds for $\hat\lambda$.

Proof. The lemma follows from the CRS property (2) since

$\sup\{\lambda > 0 : (\alpha x, \lambda\beta y) \in \Psi\} = \sup\Big\{\lambda > 0 : \Big(x, \frac{\lambda\beta}{\alpha} y\Big) \in \Psi\Big\}$. □

The following lemma is also derived from the convexity of $\hat\Psi$ and
$\Psi$.

Lemma 2. For all $r \in [0, 1]$ and for all $(x_1, y_1), (x_2, y_2) \in \Psi$,

$\lambda[r(x_1, y_1) + (1 - r)(x_2, y_2)] \ge r\lambda(x_1, y_1) + (1 - r)\lambda(x_2, y_2)$.

The same inequality holds for $\hat\lambda$.

Our first theorem on the rate of convergence relies on the following
assumptions. In what follows, we fix the point in $\Psi$ where we want to
estimate $\lambda$, and denote it by $(x_0, y_0)$. Throughout the paper, we
assume that $(X_i, Y_i)$ are independent and identically distributed with a
density $f$ supported on $\Psi \subset \mathbb{R}^p_+ \times \mathbb{R}^q_+$
and that $(x_0, y_0)$ is in the interior of $\Psi$.

(A1) $\lambda(x, y)$ is twice partially continuously differentiable in a
neighborhood of $(x_0, y_0)$.

(A2) The density $f$ of $(X, Y)$ on
$\{(x, y) \in \Psi : \|(x, y) - (x_0, \lambda(x_0, y_0) y_0)\| \le \epsilon\}$
for some $\epsilon > 0$ is bounded away from zero.
Theorem 1. Under the assumptions (A1) and (A2), it follows that

$\hat\lambda(x_0, y_0) - \lambda(x_0, y_0) = O_p(n^{-2/(p+q)})$.

Proof. We apply the technique of Kneip, Park and Simar (1998). Put
$B_p(t, r) = \{x \in \mathbb{R}^p : \|x - t\| \le r\}$ and consider the balls
near $x_0$: $C_r = B_p(x_0^{(r)}, h/2)$, $r = 1, \ldots, 2p$, where
$x_0^{(2j-1)} = x_0 - h e_j$, $x_0^{(2j)} = x_0 + h e_j$, and $e_j$ is the unit
$p$-vector with the $j$th element equal to 1 for $j = 1, 2, \ldots, p$.
Similarly, define $D_s = B_q(y_0^{(s)}, h/2)$ for $s = 1, \ldots, 2q$. Take
$h$ small enough so that $C_r \times D_s \subset \Psi$ for all
$r = 1, \ldots, 2p$ and $s = 1, \ldots, 2q$. For $r = 1, \ldots, 2p$, consider
the conical hull of $C_r$,

$\mathcal{C}_r = \{x \in \mathbb{R}^p : \exists\, a > 0 \text{ such that } ax \in C_r\}$.

Similarly, define $\mathcal{D}_s$. Define

$(U_r, V_s) = \arg\min_{(X_i, Y_i) \in \mathcal{C}_r \times \mathcal{D}_s} \lambda(X_i, Y_i)$.

Since the number of points in $\mathcal{X}_n$ falling into
$\Psi \cap [\mathcal{C}_r \times \mathcal{D}_s]$ is proportional to
$n h^{p+q-2}$, we have by assumption (A2),

(4) $\lambda(U_r, V_s) = 1 + O_p(n^{-1} h^{-p-q+2})$, $r = 1, \ldots, 2p$, $s = 1, \ldots, 2q$.

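Let $U_r^* = \alpha_r U_r$ and $V_s^* = \beta_s V_s$ for $r = 1, \ldots, 2p$
and $s = 1, \ldots, 2q$, where $\alpha_r$ and $\beta_s$ are positive
constants such that $U_r^* \in C_r$ and $V_s^* \in D_s$. Then from Lemma 1,
(4) and the fact that $\lambda, \hat\lambda \ge 1$, it holds that for
$r = 1, \ldots, 2p$ and $s = 1, \ldots, 2q$,

$\frac{\hat\lambda(U_r^*, V_s^*)}{\lambda(U_r^*, V_s^*)} = \frac{\hat\lambda(U_r, V_s)}{\lambda(U_r, V_s)} \ge \frac{1}{\lambda(U_r, V_s)} = 1 + O_p(n^{-1} h^{-p-q+2})$,

which implies that
$\hat\lambda(U_r^*, V_s^*) \ge \lambda(U_r^*, V_s^*) + O_p(n^{-1} h^{-p-q+2})$.
Since $C_r$ and $D_s$ are balls surrounding the point $(x_0, y_0)$, there
exist scalars $w_r \ge 0$ and $\omega_s \ge 0$ such that
$\sum_{r=1}^{2p} w_r = 1$, $\sum_{s=1}^{2q} \omega_s = 1$,
$x_0 = \sum_{r=1}^{2p} w_r U_r^*$ and $y_0 = \sum_{s=1}^{2q} \omega_s V_s^*$.
Thus, from the assumption (A1) we have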

$\sum_{r=1}^{2p} \sum_{s=1}^{2q} w_r \omega_s \lambda(U_r^*, V_s^*) = \lambda(x_0, y_0) + O_p(h^2)$.

This, with Lemma 2 and the fact that $\lambda \ge \hat\lambda$, shows that

$\lambda(x_0, y_0) \ge \hat\lambda(x_0, y_0) \ge \sum_{r=1}^{2p} \sum_{s=1}^{2q} w_r \omega_s \hat\lambda(U_r^*, V_s^*)$

$\qquad \ge \sum_{r=1}^{2p} \sum_{s=1}^{2q} w_r \omega_s \lambda(U_r^*, V_s^*) + O_p(n^{-1} h^{-p-q+2})$

$\qquad = \lambda(x_0, y_0) + O_p(h^2) + O_p(n^{-1} h^{-p-q+2})$.

Taking $h \sim n^{-1/(p+q)}$ completes the proof of the theorem. □

Remark 1. In the case where $\Psi$ is a convex set in $\mathbb{R}^{p+q}_+$
without having the CRS property (2), the DEA (data envelopment analysis)
estimator defined as in (3) with $\hat\Psi$ replaced by the convex-hull of
$(X_i, Y_i)$ is commonly used. In this case, the DEA estimator of
$\lambda(x_0, y_0)$ is known to have an $n^{-2/(p+q+1)}$ rate of convergence,
which is slightly worse than $n^{-2/(p+q)}$ [see Kneip, Park and Simar
(1998)]. The CRS property reduces the "effective" dimension by one.
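
To make the comparison of the two rates concrete, the following sketch evaluates the exponents in Theorem 1 and Remark 1 and the sample size at which the VRS rate matches the CRS rate. It compares rates only and ignores the constants in the $O_p$ terms, which the text does not specify; the function names are hypothetical.

```python
def rate_exponent(p, q, crs=True):
    """Exponent r such that the estimation error is O_p(n ** (-r)).

    CRS (conical hull): r = 2 / (p + q); VRS (convex hull): r = 2 / (p + q + 1).
    """
    dim = p + q if crs else p + q + 1
    return 2.0 / dim


def vrs_sample_size_matching(n_crs, p, q):
    """Sample size n_vrs solving n_vrs ** (-r_vrs) == n_crs ** (-r_crs).

    Constants in the O_p terms are ignored, so this is a rate comparison only.
    """
    ratio = rate_exponent(p, q, crs=True) / rate_exponent(p, q, crs=False)
    return n_crs ** ratio


# Example with p = 2 inputs and q = 1 output.
print(rate_exponent(2, 1, crs=True), rate_exponent(2, 1, crs=False))
print(vrs_sample_size_matching(100, 2, 1))
```

For $p = 2$ and $q = 1$ the exponent ratio is $(p+q+1)/(p+q) = 4/3$, so in terms of rates alone a VRS sample would need to be of size $n^{4/3}$ to match a CRS sample of size $n$.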

3. Asymptotic distribution. In this section we derive a representation
for the asymptotic distribution of the estimator $\hat\lambda$ defined in
(3). This representation enables one to simulate the asymptotic distribution
so that one can correct the bias of the estimator to get an improved version
of $\hat\lambda$. We work with the case where $q = 1$ first and then move to
the general case where $q > 1$. The result for the case $q = 1$ is essential
for the generalization to $q > 1$.

3.1. The case where q = 1. We consider the set

$\Psi = \{(x, y) \in A_c \times \mathbb{R}_+ : 0 \le y \le g(x)\}$,

where $g$ is a nonnegative convex function defined on a conical-hull $A_c$ of
a convex set $A \subset \mathbb{R}^p_+$ such that

(5) $g(ax) = a g(x)$ for all $a > 0$,

and that, for all $x_1, x_2 \in A_c$ with $x_1 \ne a x_2$ for any $a > 0$,

(6) $g(\alpha x_1 + (1 - \alpha) x_2) > \alpha g(x_1) + (1 - \alpha) g(x_2)$

for all $\alpha \in (0, 1)$. In this case, $\lambda(x_0, y_0) = g(x_0)/y_0$ so
that the problem of estimating $\lambda(x_0, y_0)$ reduces to that of
estimating the function $g$ at $x_0$. The estimator of $g(x_0)$ that
corresponds to $\hat\lambda(x_0, y_0)$ defined in (3) is given by

(7) $\hat g(x_0) = y_0 \hat\lambda(x_0, y_0) = \sup\{y : (x_0, y) \in \hat\Psi\}$.

We note that the CRS condition (5) is satisfied, not only by linear functions
of the form $g(x) = c^\top x$, but also by those functions
$g(x) = c(x_1^r + \cdots + x_p^r)^{1/r}$ for all positive numbers $c$ and
positive integers $r$.

Define $S_i$ by $S_i = (X_i, Y_i)$. Below we describe a canonical
transformation $T$ on $\Psi$ such that the transformed data $T(S_i)$ behave,
asymptotically, as an i.i.d. sample from a uniform distribution on a region
that can be represented by a simple $(p-1)$-dimensional quadratic function in
the transformed space. The reduction of the dimension, by one, for the
boundary function is due to the CRS property (5). This is consistent with the
dimension reduction that we noted in Remark 1 in the previous section.
The key element in the derivation of the asymptotic distribution of
$\hat g(x_0)$ is to project the data $S_i$ onto a hyperplane which is
perpendicular to the vector $x_0$ and passes through $x_0$. The projected
points lie under the locus of the function $g$ on the hyperplane, and the
estimator $\hat g(x_0)$ equals the maximal $y$ such that $(x_0, y)$ belongs
to the convex-hull of the projected points. The asymptotic distribution of
the estimator $\hat g(x_0)$ is then obtained by analyzing the statistical
properties of the convex-hull of the projected points.

Let $Q$ be a $p \times (p-1)$ matrix whose columns constitute an orthonormal
basis for $x_0^{\perp}$, the subspace of $\mathbb{R}^p$ that is perpendicular
to the vector $x_0$. Think of the transformation

$T_1 : x \mapsto \Big( \frac{x_0^\top x}{\|x_0\|},\ (Q^\top x)^\top \Big)^\top$.

This transformation maps $x$ to a vector which corresponds to $x$ in the new
coordinate system where the axes are $x_0$ and the columns of $Q$. The first
component of $T_1(x)$ is nothing other than the projection of $x$ onto the
space spanned by $x_0$, and the vector of the remaining components is its
orthogonal complement in $\mathbb{R}^p$. Thus, the inverse transform
$T_1^{-1}$ is given by

$T_1^{-1}(z) = z_1 \frac{x_0}{\|x_0\|} + Q z_2$,

where $z^\top = (z_1, z_2^\top)$.

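It would be more convenient to use a transformation that takes $x_0$ to
the origin in the new coordinate system. This can be done by the following
transformation:

$T_2 : x \mapsto \Big( \frac{x_0^\top x}{\|x_0\|} - \|x_0\|,\ \frac{\|x_0\|^2}{x_0^\top x} (Q^\top x)^\top \Big)^\top$.

Scaling by the factor $\|x_0\|^2 / x_0^\top x$ is introduced to factor out a
common scalar for the inverse map of $T_1$. In fact, $\|x_0\|^2 / x_0^\top x$
equals the scalar $c$ such that the projection of $cx$ onto the linear span
of $x_0$ equals $x_0$ itself. Thus

$\frac{\|x_0\|^2}{x_0^\top x}\, x = x_0 + Q z_2$, where $z_2 = \frac{\|x_0\|^2}{x_0^\top x} Q^\top x$.

Note that $x_0^\top x > 0$ if $x \ne 0$ since $x_0 > 0$ and $x \ge 0$. It is
easy to see that, with $z = T_2(x)$ and $z^\top = (z_1, z_2^\top)$,

$x_0^\top x = \|x_0\| (z_1 + \|x_0\|)$,

so that the inverse transform of $T_2$ is given by

$T_2^{-1}(z) = \frac{z_1 + \|x_0\|}{\|x_0\|} (x_0 + Q z_2)$,

and $T_2(x_0) = 0$.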
Define a $(p-1)$-dimensional function $g^*$ by $g^*(z_2) = g(x_0 + Q z_2)$.
For a function $\psi$, let $\dot\psi$ and $\ddot\psi$ denote, respectively,
the gradient vector and the Hessian matrix of $\psi$. Since, for any
$u \in \mathbb{R}^{p-1}$,

$u^\top \ddot g^*(z_2) u = (Qu)^\top \ddot g(x_0 + Q z_2) (Qu)$

and also $(Qu)^\top (Qu) = u^\top u$, it can be seen that $g^*$ is convex if
$g$ is convex. In particular, (6) implies the strict convexity of $g^*$. Note
that $g^*$ does not have the CRS property (5), however.

Next, we introduce a further transformation on the new coordinate system
$(z, y)$. This transformation maps the equation $y = g^*(z_2)$ to a perfect
quadratic equation in the further transformed space. Since $g^*$ is strictly
convex, $-\ddot g^*(0)/2 = Q^\top (-\ddot g(x_0)/2) Q$ is positive definite
and symmetric. Thus, there exist an orthogonal matrix $P$ and a diagonal
matrix $\Lambda$ such that $-\ddot g^*(0)/2 = P \Lambda P^\top$. The columns
of $P$ are the orthonormal eigenvectors, and the diagonal elements of
$\Lambda$ are the eigenvalues of the matrix $-\ddot g^*(0)/2$. Let $T_3$ be a
transformation that maps $\mathbb{R}^p$ to $\mathbb{R}^p$ defined by

(8) $T_3 : z \mapsto (z_1,\ n^{1/(p+1)} z_2^\top P \Lambda^{1/2})^\top$.

Note that this transformation does not change $z_1$, the first component of
$z$. Also, define a map $T_4 : \mathbb{R}^p \times \mathbb{R}_+ \to \mathbb{R}$ by

(9) $T_4 : (z, y) \mapsto n^{2/(p+1)} \Big[ y \Big( \frac{\|x_0\|}{z_1 + \|x_0\|} \Big) - g^*(0) - \dot g^*(0)^\top z_2 \Big]$.

The transformation we apply to the data $(X_i, Y_i)$ is now defined by

$T : (x, y) \mapsto (T_3 \circ T_2(x),\ T_4(T_2(x), y))$.

We explain how the equation $y = g(x)$ can be approximated, locally at
$(x_0, y_0)$, by a $(p-1)$-dimensional quadratic function in the new
coordinate system transformed by $T$. Let
$(v, w) \in \mathbb{R}^p \times \mathbb{R}$ represent the new coordinate
system obtained by the transformation $T$. Write $v^\top = (v_1, v_2^\top)$
with $v_2$ being a $(p-1)$-dimensional vector. Then, the inverse transform of
$T$ maps $v$ and $w$, respectively, to

$x = \Big( \frac{v_1 + \|x_0\|}{\|x_0\|} \Big) \big[ x_0 + n^{-1/(p+1)} Q P \Lambda^{-1/2} v_2 \big]$,

$y = \Big( \frac{v_1 + \|x_0\|}{\|x_0\|} \Big) \big[ g^*(0) + n^{-1/(p+1)} \dot g^*(0)^\top P \Lambda^{-1/2} v_2 + n^{-2/(p+1)} w \big]$.

Thus, for arbitrary compact sets $C_1 \subset \mathbb{R}^{p-1}$ and
$C_2 \subset \mathbb{R}$, we obtain using the CRS property (5) that,
uniformly for $v_1 \in \mathbb{R}_+$, $v_2 \in C_1$ and $w \in C_2$,

$y = g(x)$

$\Longleftrightarrow\ g^*(0) + n^{-1/(p+1)} \dot g^*(0)^\top P \Lambda^{-1/2} v_2 + n^{-2/(p+1)} w$
ASYMPTOTIC DISTRIBUTION OF CONICAL-HULL ESTIMATORS 9 
$\qquad = g^*(n^{-1/(p+1)} P \Lambda^{-1/2} v_2)$

$\Longleftrightarrow\ w = -v_2^\top v_2 + o(1)$

as $n$ tends to infinity, provided that $\ddot g^*$ is continuous at 0.

Now we give a representation of the limit distribution of $\hat g$ as given
in (7). Define

(10) $\theta = \|x_0\| \int_0^\infty u^p f(u x_0, u g(x_0)) \, du$,

(11) $\kappa = \theta \det(\Lambda)^{-1/2}$.

Define a set $R_n(\kappa) \subset \mathbb{R}^p$ of points $(v_2, w)$ such that

$v_2 \in \big[ -\tfrac{1}{2} \kappa^{-1/(p+1)} n^{1/(p+1)},\ \tfrac{1}{2} \kappa^{-1/(p+1)} n^{1/(p+1)} \big]^{p-1}$,

$w \in \big[ -v_2^\top v_2 - \kappa^{-2/(p+1)} n^{2/(p+1)},\ -v_2^\top v_2 \big]$.

The volume of this set in $\mathbb{R}^p$ equals $n\kappa^{-1}$. Let
$(V_{2i}, W_i)$ be a random sample from the uniform distribution on
$R_n(\kappa)$. This random sample can be generated once we know $\kappa$. Let
$Z_n(\cdot)$ be defined as $\hat g$ in (7) with $\hat\Psi$ being replaced by
the convex-hull of $(V_{2i}, W_i)$; that is,

(12) $Z_n(v_2) = \sup\Big\{ \sum_{i=1}^n \gamma_i W_i : v_2 = \sum_{i=1}^n \gamma_i V_{2i},\ \sum_{i=1}^n \gamma_i = 1,\ \gamma_i \ge 0,\ i = 1, \ldots, n \Big\}$.
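
The construction of $Z_n$ can be simulated directly once $\kappa$ is given. The sketch below does this for $p = 2$, where $v_2$ is a scalar and the value of the convex-hull at $0$ can be found by scanning pairs of points on either side of $0$. It is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the pairwise scan is a simple $O(n^2)$ substitute for the linear program that would be needed for general $p$.

```python
import random


def hull_value_at_zero(points):
    """Largest w with (0, w) in the convex hull of planar points (v, w).

    For scalar v, the maximum is attained on a segment joining a point
    with v <= 0 to a point with v >= 0, so scanning all such pairs is
    enough. Returns None if 0 lies outside the range of the v's.
    """
    left = [(v, w) for v, w in points if v <= 0]
    right = [(v, w) for v, w in points if v >= 0]
    best = None
    for vl, wl in left:
        for vr, wr in right:
            if vr == vl:                      # both points sit at v = 0
                w0 = max(wl, wr)
            else:
                t = vr / (vr - vl)            # weight on the left point
                w0 = t * wl + (1 - t) * wr
            if best is None or w0 > best:
                best = w0
    return best


def simulate_zn0(n, kappa, rng):
    """One draw of Z_n(0) for p = 2, sampling uniformly on R_n(kappa)."""
    half_width = 0.5 * (n / kappa) ** (1.0 / 3.0)   # p + 1 = 3
    depth = (n / kappa) ** (2.0 / 3.0)
    pts = []
    for _ in range(n):
        v = rng.uniform(-half_width, half_width)
        w = -v * v - rng.uniform(0.0, depth)        # w in [-v^2 - depth, -v^2]
        pts.append((v, w))
    return hull_value_at_zero(pts)


rng = random.Random(0)                               # fixed seed for repeatability
draws = [simulate_zn0(100, 1.0, rng) for _ in range(20)]
print(min(draws), max(draws))
```

Every draw is nonpositive, because all sampled points lie on or below the parabola $w = -v_2^\top v_2$, which passes through the origin.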

For a small $\epsilon > 0$, define a set in $\mathbb{R}^{p+1}_+$ by

(13) $H_\epsilon(x_0) = \{(u(x_0 + Q z_2),\ u(g(x_0 + Q z_2) - y)) : u > 0,\ \|z_2\| \le \epsilon,\ 0 \le y \le \epsilon\}$.

In the theorem below and those that follow, we will measure the distance
between two distributions by the following modification of the Mallows
distance:

$d(\mu_1, \mu_2) = \inf_{Z_1, Z_2} \{E(Z_1 - Z_2)^2 \wedge 1 : \mathcal{L}(Z_1) = \mu_1,\ \mathcal{L}(Z_2) = \mu_2\}$.

Convergence in this metric is equivalent to weak convergence.



Theorem 2. Assume (A1) and (A2). In addition, assume that $-\ddot g^*$ is
positive definite and continuous at 0 and that the density $f$ of $(X, Y)$ is
uniformly continuous on $H_\epsilon(x_0)$ for an arbitrarily small
$\epsilon > 0$. Let $L_{n1}$ and $L_{n2}$ denote the distributions of
$n^{2/(p+1)}[\hat g(x_0) - g(x_0)]$ and $Z_n(0)$, respectively. Then,
$d(L_{n1}, L_{n2}) \to 0$ as $n$ tends to infinity.

Computation of the distribution of $Z_n$ depends solely on knowledge of
$\kappa$. Thus one can approximate the distribution of $\hat g(x_0)$ by
estimating $\kappa$ and then simulating $Z_n$ with the estimated $\kappa$.
The approximation enables one to correct the downward bias of $\hat g(x_0)$
and get an improved estimator of $g(x_0)$. Estimation of $\kappa$ and
bias-correction for $\hat g(x_0)$ will be discussed in Section 4.
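
The way simulated draws of $Z_n(0)$ translate into a corrected estimate and an interval can be sketched as follows, using the approximation $n^{2/(p+1)}[\hat g(x_0) - g(x_0)] \approx Z_n(0)$ from Theorem 2. This is a hedged illustration: centering with the mean of the draws and the simple order-statistic quantiles are assumptions of this sketch, and the function name `bias_correct` is hypothetical; the paper's own procedure is the one described in Section 4.

```python
import statistics


def bias_correct(g_hat, n, p, z_draws, alpha=0.05):
    """Bias-corrected estimate and (1 - alpha) interval for g(x0).

    Uses n^{2/(p+1)} (g_hat - g) ~ Z_n(0), i.e. g ~ g_hat - n^{-2/(p+1)} Z.
    z_draws are simulated values of Z_n(0), which are all <= 0.
    """
    scale = n ** (-2.0 / (p + 1))
    z = sorted(z_draws)
    lo_q = z[int((alpha / 2) * (len(z) - 1))]         # lower quantile of Z
    hi_q = z[int((1 - alpha / 2) * (len(z) - 1))]     # upper quantile of Z
    corrected = g_hat - scale * statistics.mean(z)    # shifts g_hat upward
    interval = (g_hat - scale * hi_q, g_hat - scale * lo_q)
    return corrected, interval


# Hypothetical inputs, for illustration only.
est, (lower, upper) = bias_correct(1.0, 100, 3, [-2.0, -1.0, -0.5, -0.5])
print(est, lower, upper)
```

Because the draws are nonpositive, the corrected value is never below $\hat g(x_0)$, which matches the direction of the bias noted in the text.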

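Proof of Theorem 2. We first give a geometric description of the
estimator $\hat g$. Consider a hyperplane in $\mathbb{R}^p$ defined by

(14) $\mathcal{P}(x_0) = \{x \in \mathbb{R}^p : x_0^\top (x - x_0) = 0\}$.

This hyperplane is perpendicular to the vector $x_0$ and passes through
$x_0$. Let $P_i$ be the point where the ray $R_i$ meets the hyperplane
$\mathcal{P}^\dagger(x_0) = \mathcal{P}(x_0) \times \mathbb{R}_+$ in
$\mathbb{R}^{p+1}$. It follows that

(15) $P_i = \frac{\|x_0\|^2}{x_0^\top X_i} (X_i, Y_i)$.

Define $\hat\Psi(x_0)$ to be the convex-hull of the points $P_i$. We claim that

(16) $\hat\Psi(x_0) = \mathcal{P}^\dagger(x_0) \cap \hat\Psi$.

This means that $\hat\Psi(x_0)$ is a section of $\hat\Psi$ obtained by
cutting $\hat\Psi$ by the hyperplane $\mathcal{P}^\dagger(x_0)$. The fact that
$\hat\Psi(x_0) \subset \mathcal{P}^\dagger(x_0) \cap \hat\Psi$ follows from
convexity of $\mathcal{P}^\dagger(x_0)$ and $\hat\Psi$. The reverse inclusion
also holds. To see this, let
$(x, y) \in \mathcal{P}^\dagger(x_0) \cap \hat\Psi$. Since $\hat\Psi$ is the
convex-hull of the rays $R_i$, it follows that there exist $\gamma_i \ge 0$
such that $x = \sum_{i=1}^n \gamma_i X_i$ and $y = \sum_{i=1}^n \gamma_i Y_i$.
Since $(x, y) \in \mathcal{P}^\dagger(x_0)$, we have

(17) $\sum_{i=1}^n \gamma_i \frac{x_0^\top X_i}{\|x_0\|^2} = 1$.

Let $\xi_i = (x_0^\top X_i / \|x_0\|^2) \gamma_i \ge 0$ for $1 \le i \le n$. By
(17), $\sum_{i=1}^n \xi_i = 1$. By (15), we get
$(x, y) = \sum_{i=1}^n \xi_i P_i$, which shows $(x, y) \in \hat\Psi(x_0)$.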

Since $\bigcup_{a > 0} a \mathcal{P}^\dagger(x_0) = \mathbb{R}^{p+1}_+$, the
CRS property of $\hat\Psi$ and (16) thus yield

(18) $\hat\Psi = \bigcup_{a > 0} a \hat\Psi(x_0) = \{(ax, ay) : (x, y) \in \hat\Psi(x_0),\ a > 0\}$.

Recall the definition of $\hat g$ in (7). Also, note that, for
$x \in \mathcal{P}(x_0)$, we have $(x, y) \in \hat\Psi$ if and only if
$(x, y) \in \hat\Psi(x_0)$. This follows from (18) and the fact that $a = 1$
is the only constant $a > 0$ such that $(x, y) \in a\hat\Psi(x_0)$ if
$x \in \mathcal{P}(x_0)$. This gives

(19) $\hat g(x) = \sup\{y : (x, y) \in \hat\Psi(x_0)\}$ if $x \in \mathcal{P}(x_0)$.
See Figure 1 for an illustration in the case of $p = 2$ and $q = 1$.

Let $Q$ be the matrix defined in the paragraph that contains the definition
of the transformation $T_1$ early in this section. Since
$\mathcal{P}(x_0) = \{x_0 + Q z_2 \in \mathbb{R}^p_+ : z_2 \in \mathbb{R}^{p-1}\}$,
the set

(20) $\Psi(x_0) = \{(x_0 + Q z_2, y) \in A_c \times \mathbb{R}_+ : z_2 \in \mathbb{R}^{p-1},\ 0 \le y \le g(x_0 + Q z_2)\}$

equals the section of $\Psi$ obtained by cutting $\Psi$ by the hyperplane
$\mathcal{P}^\dagger(x_0)$; that is,
$\Psi(x_0) = \mathcal{P}^\dagger(x_0) \cap \Psi$. In the new coordinate system

$(z, y') \equiv (T_2(x),\ y \|x_0\|^2 / (x_0^\top x))$,

the set $\Psi(x_0)$ in (20) can be represented by $\{0\} \times \Psi^*(x_0)$
where

(21) $\Psi^*(x_0) = \{(z_2, y') : z_2 \in \mathbb{R}^{p-1}(x_0),\ 0 \le y' \le g^*(z_2)\}$

and $\mathbb{R}^{p-1}(x_0)$ denotes the set of $z_2$ such that
$x_0 + Q z_2 \in A_c$. Also, in that new coordinate system the points $P_i$
defined in (15) correspond to $(0, P_i^*)$ where $P_i^* = (Z_{2i}, Y_i')$,
$Z_{2i} = (\|x_0\|^2 / x_0^\top X_i) Q^\top X_i$ and
$Y_i' = (\|x_0\|^2 / x_0^\top X_i) Y_i$. Since convex-hulls are equivariant
under linear transformations, this means that in the new coordinate system,
$\hat\Psi(x_0)$ corresponds to $\{0\} \times \hat\Psi^*(x_0)$ where
$\hat\Psi^*(x_0)$ is the convex-hull of the points $P_i^*$. Now define

$\hat g^*(z_2) = \hat g(x_0 + Q z_2)$

on $\mathbb{R}^{p-1}(x_0)$. Since $(x_0 + Q z_2, y) \in \hat\Psi(x_0)$ is
equivalent to $(z_2, y) \in \hat\Psi^*(x_0)$, it follows from (19) that

(22) $\hat g^*(z_2) = \sup\{y : (z_2, y) \in \hat\Psi^*(x_0)\}$, $z_2 \in \mathbb{R}^{p-1}(x_0)$.
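
The geometric description above suggests a direct way to compute $\hat g(x_0)$ when $p = 2$: rescale each sample point onto the hyperplane through $x_0$ as in (15), express it in the coordinates $(Z_{2i}, Y_i')$, and read off the height of the convex-hull at $z_2 = 0$ as in (22). The sketch below follows these steps under that assumption; the function name `g_hat_p2`, the choice of the unit vector playing the role of $Q$, and the sample values are hypothetical.

```python
import math


def g_hat_p2(x0, sample):
    """Conical-hull (CRS) estimate of g(x0) for p = 2, q = 1.

    Each sample point ((X_1, X_2), Y) is rescaled onto the hyperplane
    through x0 perpendicular to x0, giving (Z_2i, Y_i'); the estimate is
    the height of the convex hull of these planar points above z_2 = 0.
    Returns None if x0 lies outside the cone spanned by the inputs.
    """
    norm2 = x0[0] ** 2 + x0[1] ** 2
    norm = math.sqrt(norm2)
    q = (-x0[1] / norm, x0[0] / norm)        # unit vector orthogonal to x0
    left, right = [], []
    for (x1, x2), y in sample:
        inner = x0[0] * x1 + x0[1] * x2      # x0' X_i
        if inner <= 0:
            continue                          # ray does not meet the hyperplane
        s = norm2 / inner                     # scaling factor in (15)
        pt = (s * (q[0] * x1 + q[1] * x2), s * y)
        if pt[0] <= 0:
            left.append(pt)
        if pt[0] >= 0:
            right.append(pt)
    best = None
    for zl, yl in left:
        for zr, yr in right:
            if zr == zl:                      # both points sit at z_2 = 0
                val = max(yl, yr)
            else:
                t = zr / (zr - zl)            # weight on the left point
                val = t * yl + (1 - t) * yr
            if best is None or val > best:
                best = val
    return best


# Two points on the linear frontier g(x) = x_1 + x_2, plus an interior point.
sample = [((2.0, 0.0), 2.0), ((0.0, 4.0), 4.0), ((1.0, 1.0), 1.5)]
print(g_hat_p2((1.0, 1.0), sample))
```

In this toy sample the two frontier points lie on rays on either side of $x_0 = (1, 1)$, so the estimate recovers $g(x_0) = 2$ up to rounding, while the interior point has no influence.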




Fig. 1. An illustration of $\mathcal{P}(x_0)$, $P_i$, $\hat\Psi$ and $\hat g$
in the case of $p = 2$ and $q = 1$. The crosses are the points $P_i$, and the
gray surface is the roof of the conical-hull estimator $\hat\Psi$.
Let $f$ denote the density of the original random vector $(X, Y)$ and $f^*$
denote the density of the transformed vector $(Z_{2i}, Y_i')$. The arguments
in the preceding paragraph imply that the distribution of
$\hat g(x_0) - g(x_0)$ equals that of $\hat g^*(0) - g^*(0)$ where
$\hat g^*$ is the convex-hull estimator of $g^*$ constructed from a random
sample of size $n$ generated from the density $f^*$. Let
$\kappa^* = \det(\Lambda)^{-1/2} f^*(0, g^*(0))$ where $\Lambda$ is the
diagonal matrix with its entries being the eigenvalues of $-\ddot g^*(0)/2$.
Define $Z_n^*$ as a version of $\hat g^*$ constructed from a random sample
from the uniform distribution on $R_n(\kappa^*) \subset \mathbb{R}^p$ where
$R_n$ is defined immediately after (11). Then one can proceed as in the proof
of Theorem 1 of Jeong and Park (2006) to show that the asymptotic
distribution of $n^{2/(p+1)}(\hat g^*(0) - g^*(0))$ is identical to that of
$Z_n^*(0)$ where one uses the transformations $T_3^*$ and $T_4^*$ defined by

$T_3^* : z_2 \mapsto n^{1/(p+1)} \Lambda^{1/2} P^\top z_2$,

$T_4^* : (z_2, y') \mapsto n^{2/(p+1)} (y' - g^*(0) - \dot g^*(0)^\top z_2)$.

Recalling the definitions of the transformations $T_3$ and $T_4$ in (8) and
(9), respectively, $T_3^*(z_2)$ equals $T_3(z)$ without the first component,
where $z^\top = (z_1, z_2^\top)$, and
$T_4^*(z_2, y\|x_0\| / (z_1 + \|x_0\|)) = T_4(z, y)$. Below, we prove that
$\kappa^*$ equals $\kappa$ defined in (11) so that $Z_n^* = Z_n$ in
distribution, which concludes the proof of the theorem.

Let $T^*$ denote the transformation that maps $(x, y)$ to

$(z, y') = (T_2(x),\ y\|x_0\|^2 / (x_0^\top x))$.

Let $c(z_1) = (z_1 + \|x_0\|)/\|x_0\|$. The Jacobian of the inverse transform
of $T^*$ equals

$J(z) = c(z_1) \, \big| \det\big[ \|x_0\|^{-1}(x_0 + Q z_2),\ c(z_1) Q \big] \big|$

$\qquad = c(z_1) \det{}^{1/2} \begin{pmatrix} 1 + (\|z_2\|/\|x_0\|)^2 & (c(z_1)/\|x_0\|) z_2^\top \\ (c(z_1)/\|x_0\|) z_2 & c(z_1)^2 I_{p-1} \end{pmatrix}$,

where $I_{p-1}$ denotes the identity matrix of dimension $(p-1)$. The second
equality in the above calculation follows from the fact that the columns of
$Q$ are perpendicular to $x_0$. Thus the joint density of $T^*(X, Y)$ at the
point $(z, y')$ is given by $J(z) f(c(z_1)(x_0 + Q z_2), c(z_1) y')$. The
density $f^*(z_2, y')$ is simply the marginalization of this joint density
with respect to $z_1$ so that

$f^*(z_2, y') = \int_{-\|x_0\|}^{\infty} J(z) f(c(z_1)(x_0 + Q z_2), c(z_1) y') \, dz_1$.

Now, since $J(z_1, 0) = c(z_1)^p$, we obtain

$f^*(0, g^*(0)) = \int_{-\|x_0\|}^{\infty} c(z_1)^p f(c(z_1) x_0, c(z_1) g^*(0)) \, dz_1 = \theta$,

where $\theta$ is defined in (10). □

Fig. 2. Solid curves are the empirical distribution functions of $Z_n(0)$,
and the dotted curves are those of $n^{2/(p+1)}\{\hat g(x_0) - g(x_0)\}$ in
the case where $n = 100$ and $\lambda = 3$. Panels: (a) $x_0 = (.25, .75)$;
(b) $x_0 = (.75, .75)$.

To see how well the distribution of $n^{2/(p+1)}\{\hat g(x_0) - g(x_0)\}$ is
approximated by that of $Z_n(0)$, we took a Cobb-Douglas CRS production
function $g(x) = x_1^{0.4} x_2^{0.6}$ ($p = 2$). We generated 5000 random
samples of size $n = 100$ and $400$ from
$f(x_1, x_2, y) = \lambda x_1^{-0.4\lambda} x_2^{-0.6\lambda} y^{\lambda - 1}$
supported on
$\Psi = \{(x_1, x_2, y) : 0 \le x_1, x_2 \le 1,\ 0 \le y \le g(x_1, x_2)\}$.
This yielded i.i.d. copies of $(X_1, X_2, Y)$ with
$X_1 \sim \mathrm{Uniform}[0, 1]$, $X_2 \sim \mathrm{Uniform}[0, 1]$ and
$Y = g(X_1, X_2) e^{-U/\lambda}$ where $U \sim \mathrm{Exp}(1)$. Figures 2 and
3 depict the empirical distributions of
$n^{2/(p+1)}\{\hat g(x_0) - g(x_0)\}$ and $Z_n(0)$ based on these samples in
the case where $\lambda = 3$. The figures suggest that the approximation is
fairly good for moderate sample sizes and gets better as the sample size
increases.
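
The data-generating design of this simulation is easy to reproduce. The sketch below draws one sample from it; the function names and the seed are hypothetical choices made for illustration, while the frontier, the uniform inputs and the exponential inefficiency term follow the description in the text.

```python
import math
import random


def g(x1, x2):
    """Cobb-Douglas CRS frontier g(x) = x1^0.4 * x2^0.6."""
    return x1 ** 0.4 * x2 ** 0.6


def draw_sample(n, lam, rng):
    """n i.i.d. draws of (X1, X2, Y) with Y = g(X1, X2) * exp(-U / lam)."""
    out = []
    for _ in range(n):
        x1, x2 = rng.random(), rng.random()   # X1, X2 ~ Uniform[0, 1)
        u = rng.expovariate(1.0)              # U ~ Exp(1)
        out.append((x1, x2, g(x1, x2) * math.exp(-u / lam)))
    return out


rng = random.Random(2024)                      # hypothetical seed
data = draw_sample(100, 3.0, rng)
print(all(y <= g(x1, x2) for x1, x2, y in data))
```

Under this design the efficiency ratio $Y/g(X_1, X_2) = e^{-U/\lambda}$ has distribution function $t^{\lambda}$ on $[0, 1]$, so larger $\lambda$ places more observations near the frontier.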

Theorem 2 excludes the case where $g$ is linear; that is, $g(x) = c^\top x$
for some vector $c$. The latter case needs a different treatment. In the
following theorem, we give the limit distribution in this case. To state the
theorem, let $(V_{2i}^L, W_i^L)$ be a random sample from the uniform
distribution on the $p$-dimensional rectangle,

(23) $R_n^L(\theta) = \big[ -\tfrac{1}{2} \theta^{-1/(p+1)} n^{1/(p+1)},\ \tfrac{1}{2} \theta^{-1/(p+1)} n^{1/(p+1)} \big]^{p-1} \times \big[ -\theta^{-2/(p+1)} n^{2/(p+1)},\ 0 \big]$,

where $\theta$ is defined in (10). The volume of this set in $\mathbb{R}^p$
equals $n\theta^{-1}$. Let $Z_n^L(\cdot)$ be a version of $Z_n(\cdot)$
constructed from $(V_{2i}^L, W_i^L)$ replacing $(V_{2i}, W_i)$.

14 B. U. PARK, S.-O. JEONG AND L. SIMAR 



Fig. 3. Solid curves are the empirical distribution functions of $Z_n(0)$,
and the dotted curves are those of $n^{2/(p+1)}\{\hat g(x_0) - g(x_0)\}$ in
the case where $n = 400$ and $\lambda = 3$. Panels: (a) $x_0 = (.25, .75)$;
(b) $x_0 = (.75, .75)$.

Theorem 3. Assume (A1) and (A2). Assume further that
$\Psi = \{(x, y) \in \mathbb{R}^{p+1}_+ : 0 \le y \le c^\top x\}$ for some
constant vector $c \ne 0$ and that the density $f$ of $(X, Y)$ is uniformly
continuous on $H_\epsilon(x_0)$ for an arbitrarily small $\epsilon > 0$. Let
$L_{n1}$ and $L'_{n2}$ denote the distributions of
$n^{2/(p+1)}[\hat g(x_0) - c^\top x_0]$ and $Z_n^L(0)$, respectively. Then
$d(L_{n1}, L'_{n2}) \to 0$ as $n$ tends to infinity.

Proof. In this case we consider the following transformation:

(24) $T^L : (x, y) \mapsto (T_3^L \circ T_2(x),\ T_4^L(T_2(x), y))$,

where $T_3^L : z \mapsto (z_1,\ n^{1/(p+1)} z_2^\top)^\top$ and

$T_4^L : (z, y) \mapsto n^{2/(p+1)} \Big( \frac{\|x_0\|}{z_1 + \|x_0\|}\, y - c^\top x_0 - c^\top Q z_2 \Big)$.

Let $(V^L, W^L) = T^L(X, Y)$. Then it can be shown as in the proof of Theorem
2 that the density of $(V_2^L, W^L)$ is given by $n^{-1}\theta\{1 + o(1)\}$
uniformly for $v_2^L$ and $w^L$ in any compact sets of respective dimension.
The rest of the proof is the same as that for Theorem 2. □

In the special case where $p = 1$, we can derive the limit distribution
explicitly. In this case, the boundary function $g$ is linear and takes the
form $g(x) = cx$ for some constant $c > 0$. The transformation $T^L$ in (24)
reduces to

$T^L(x, y) = \Big( x - x_0,\ n \Big[ \frac{x_0}{x}\, y - c x_0 \Big] \Big)$.

The marginal density of $W^L$, where $(V^L, W^L) = T^L(X, Y)$, is
approximated by the constant $n^{-1}\theta$ uniformly for $w^L$ in any
compact subset of $\mathbb{R}_-$ where $\theta$ in this case equals
$x_0 \int_0^\infty u f(u x_0, u c x_0) \, du$. According to Theorem 3, the
limit distribution of $n(\hat g(x_0) - g(x_0))$ equals the limit distribution
of $Z_n^L(0)$, which is nothing else than $\max_{i=1}^n W_i^L$ in this
simplest case where $W_i^L$ are a random sample from the uniform distribution
on $[-n\theta^{-1}, 0]$. Since $-\max_{i=1}^n W_i^L$ has the exponential
distribution with mean $\theta^{-1}$ in the limit, we have

$P[n(g(x_0) - \hat g(x_0)) \le w] \to 1 - \exp(-\theta w)$

for all $w \ge 0$.
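
The exponential limit in the case $p = 1$ can be checked numerically, because the distribution of the maximum of $n$ uniform variables on $[-n\theta^{-1}, 0]$ is available in closed form: $P[-\max_i W_i^L \le w] = 1 - (1 - \theta w/n)^n$ for $0 \le w \le n/\theta$. The sketch below compares this finite-$n$ expression with its limit; the function names and the value of $\theta$ are hypothetical.

```python
import math


def cdf_neg_max_uniform(w, n, theta):
    """P(-max_i W_i <= w) for W_i i.i.d. uniform on [-n / theta, 0]."""
    if w <= 0:
        return 0.0
    if w >= n / theta:
        return 1.0
    return 1.0 - (1.0 - theta * w / n) ** n


def cdf_exponential(w, theta):
    """Limit distribution: exponential with mean 1 / theta."""
    return 1.0 - math.exp(-theta * w) if w > 0 else 0.0


theta = 2.0                                    # hypothetical value of theta
for n in (10, 100, 10000):
    gap = max(abs(cdf_neg_max_uniform(k / 10, n, theta)
                  - cdf_exponential(k / 10, theta)) for k in range(1, 50))
    print(n, gap)
```

The printed gap between the two distribution functions shrinks as $n$ grows, in line with the convergence $(1 - \theta w/n)^n \to e^{-\theta w}$.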

3.2. The case where q > 1. In this section we extend the results in the
previous section to the case where $q > 1$ and $\Psi$ is a conical-hull of a
convex set $A$ in $\mathbb{R}^{p+q}_+$. For this we make a canonical
transformation on the $y$-space so that the problem for $q > 1$ is reduced to
the case where $q = 1$. Again we fix the point $(x_0, y_0)$ where we want to
estimate the function $\lambda$.

Let $\Gamma$ be a $q \times (q-1)$ matrix whose columns form a basis for
$y_0^{\perp}$. Consider a transformation $\mathcal{T}$ that maps
$y \in \mathbb{R}^q_+$ to
$(u, \omega) \in \mathbb{R}^{q-1} \times \mathbb{R}_+$ where

(25) $u = \Gamma^\top y$, $\quad \omega = \frac{y_0^\top y}{\|y_0\|}$.

Then, in the new coordinate system $(x, u, \omega)$, the set $\Psi$ can be
represented as

(26) $\Psi_{\mathcal{T}} = \Big\{ (x, u, \omega) \in \mathbb{R}^p_+ \times \mathbb{R}^{q-1} \times \mathbb{R}_+ : \Big( x,\ \Gamma u + \omega \frac{y_0}{\|y_0\|} \Big) \in \Psi \Big\}$.

Define a $(p+q-1)$-dimensional function

$g_{\mathcal{T}}(x, u) = g_{\mathcal{T}}(x, u; y_0) = \sup\Big\{ \alpha > 0 : \Big( x,\ \Gamma u + \alpha \frac{y_0}{\|y_0\|} \Big) \in \Psi \Big\}$.

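This is a boundary function in the transformed space such that all points
$(x, u, \omega)$ in $\Psi_{\mathcal{T}}$ lie below the surface represented by
the equation $\omega = g_{\mathcal{T}}(x, u)$. Convexity of the function
$g_{\mathcal{T}}$ follows from the fact that, due to convexity of $\Psi$,

$\alpha \in \Big\{ b > 0 : \Big( x,\ \Gamma u + b \frac{y_0}{\|y_0\|} \Big) \in \Psi \Big\}$
and
$\alpha' \in \Big\{ b > 0 : \Big( x',\ \Gamma u' + b \frac{y_0}{\|y_0\|} \Big) \in \Psi \Big\}$,

together, imply, for $a \in [0, 1]$,

$a\alpha + (1 - a)\alpha' \in \Big\{ b > 0 : \Big( ax + (1 - a)x',\ \Gamma(au + (1 - a)u') + b \frac{y_0}{\|y_0\|} \Big) \in \Psi \Big\}$.

Also, it has the CRS property (5) since $\Psi$ satisfies (2). Furthermore,
since $(x, y) \in \Psi$ if and only if
$(x, \mathcal{T}(y)) \in \Psi_{\mathcal{T}}$ and
$\mathcal{T}(a y_0) = (0^\top, a\|y_0\|)^\top$ for all $a > 0$, we obtain

$g_{\mathcal{T}}(x_0, 0) = \sup\Big\{ \alpha > 0 : \Big( x_0,\ \alpha \frac{y_0}{\|y_0\|} \Big) \in \Psi \Big\}$

$\qquad = \sup\{\alpha > 0 : (x_0, (0, \alpha)) \in \Psi_{\mathcal{T}}\}$

(27) $\qquad = \|y_0\| \sup\{\lambda > 0 : (x_0, (0, \lambda\|y_0\|)) \in \Psi_{\mathcal{T}}\}$

$\qquad = \|y_0\| \sup\{\lambda > 0 : (x_0, \mathcal{T}(\lambda y_0)) \in \Psi_{\mathcal{T}}\}$

$\qquad = \|y_0\| \lambda(x_0, y_0)$.

Here and below, $0$ denotes the $(q-1)$-dimensional zero vector. Thus the
problem of estimating $\lambda(x_0, y_0)$ using $(X_i, Y_i)$ is reduced to
that of estimating $g_{\mathcal{T}}(x_0, 0)$ in the transformed space using
$(X_i, \mathcal{T}(Y_i))$.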

We note that in the proof of Theorem 2 we use only convexity and the CRS
property of $g$. Thus the theory we developed in the previous section is
applicable to $g_{\mathcal{T}}$. Let $(U_i, \Omega_i) = \mathcal{T}(Y_i)$
where $U_i$ is the vector of the first $(q-1)$ elements of
$\mathcal{T}(Y_i)$, and $\Omega_i$ is the scalar-valued random variable. The
joint density of $(X_i, U_i, \Omega_i)$ at the point $(x, u, \omega)$ is
given by

(28) $f_{\mathcal{T}}(x, u, \omega) = \det{}^{1/2}(\Gamma^\top \Gamma)\, f\Big( x,\ \Gamma u + \omega \frac{y_0}{\|y_0\|} \Big)$.

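The constant $\theta$ defined in (10) that corresponds to the density
$f_{\mathcal{T}}$ equals

$\theta_{\mathcal{T}} = \|(x_0, 0)\| \int_0^\infty u^{p+q-1} f_{\mathcal{T}}(u x_0, 0, u g_{\mathcal{T}}(x_0, 0)) \, du$

$\qquad = \det{}^{1/2}(\Gamma^\top \Gamma) \|x_0\| \int_0^\infty u^{p+q-1} f\Big( u x_0,\ u g_{\mathcal{T}}(x_0, 0) \frac{y_0}{\|y_0\|} \Big) \, du$

$\qquad = \det{}^{1/2}(\Gamma^\top \Gamma) \|x_0\| \int_0^\infty u^{p+q-1} f(u x_0,\ u \lambda(x_0, y_0) y_0) \, du$,

where the last identity follows from (27). The determinant that corresponds
to $\det(\Lambda)$ in the definition of $\kappa$ in (11) is
$\det(-Q_{\mathcal{T}}^\top \ddot g_{\mathcal{T}}(x_0, 0) Q_{\mathcal{T}} / 2)$
where $Q_{\mathcal{T}}$ is a $(p+q-1) \times (p+q-2)$ matrix whose columns
form an orthonormal basis for $(x_0, 0)^{\perp}$. Thus we modify the
definition of $\kappa$ as

$\kappa_{\mathcal{T}} = \theta_{\mathcal{T}} \det\big( -Q_{\mathcal{T}}^\top \ddot g_{\mathcal{T}}(x_0, 0) Q_{\mathcal{T}} / 2 \big)^{-1/2}$.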
Recall that the construction of $Z_n$ defined in (12) depends only on
$\kappa$ and $p$. Define $Z_{n,\mathcal{T}}$ as a version of $Z_n$ with
$\kappa_{\mathcal{T}}$ and $(p+q-1)$ replacing $\kappa$ and $p$, respectively.
Also, define a $(p+q-2)$-dimensional function
$g_{\mathcal{T}}^*(z_2) = g_{\mathcal{T}}((x_0, 0) + Q_{\mathcal{T}} z_2)$, and
$H_{\epsilon,\mathcal{T}}(x_0, 0)$ as $H_\epsilon(x_0)$ at (13) with
$(p+q-1)$, $g_{\mathcal{T}}$, $(x_0, 0)$ and $Q_{\mathcal{T}}$ replacing $p$,
$g$, $x_0$ and $Q$, respectively. Then we have the following theorem for the
limit distribution of $\hat\lambda(x_0, y_0)$ for arbitrary dimensions
$p, q \ge 1$.

Theorem 4. Assume (A1) and (A2). In addition, assume that
$-\ddot g_{\mathcal{T}}^*$ is positive definite and continuous at 0, and that
the density $f_{\mathcal{T}}$ given at (28) is uniformly continuous on
$H_{\epsilon,\mathcal{T}}(x_0, 0)$ for an arbitrarily small $\epsilon > 0$.
Let $L_{n1}$ and $L_{n2}$ denote the distributions of
$n^{2/(p+q)}[\hat\lambda(x_0, y_0) - \lambda(x_0, y_0)]$ and
$Z_{n,\mathcal{T}}(0_{p+q-2})/\|y_0\|$, respectively. Then,
$d(L_{n1}, L_{n2}) \to 0$ as $n$ tends to infinity.

Theorem 4 excludes the case where $\Psi = \{(x,y) \in \mathbb{R}_+^{p+q} : c_1^T x - c_2^T y \ge 0\}$
for some constant vectors $c_1, c_2 > 0$. Below we treat this case. When $q = 1$,
this corresponds to the case where the boundary function $g$ is linear in $x$.

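Define

\[
c_\Gamma = \frac{\|y_0\|}{c_2^T y_0} \begin{pmatrix} c_1 \\ \Gamma^T(-c_2) \end{pmatrix}.
\]

Then $\Psi_\Gamma$ defined in (26) takes the form

\[
\Psi_\Gamma = \bigl\{(x,u,\omega) : 0 \le \omega \le c_\Gamma^T (x,u)\bigr\}
\]

and it holds that

\[
c_\Gamma^T (x_0,0) = \|y_0\|\,\lambda(x_0,y_0).
\]

Thus we can apply the arguments leading to Theorem 3 with $p$, $c$, $x_0$ and
$Q$ being replaced by $(p+q-1)$, $c_\Gamma$, $(x_0,0)$ and $Q_\Gamma$, respectively.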
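
As a numerical sanity check of the identity $c_\Gamma^T(x_0,0) = \|y_0\|\lambda(x_0,y_0)$, the following Python sketch works through the case $p = q = 2$. It assumes that the canonical transformation writes $y = \Gamma u + \omega\, y_0/\|y_0\|$ with $\Gamma$ an orthonormal basis of $y_0^\perp$. The coefficient vectors `c1`, `c2` and all function names are illustrative choices and are not taken from the paper.

```python
import math

def gamma_basis(y0):
    """Unit vector orthogonal to y0: the single column of Gamma when q = 2."""
    norm = math.hypot(y0[0], y0[1])
    return (-y0[1] / norm, y0[0] / norm)

def canonical_coords(y, y0):
    """Coordinates (u, omega) of y along Gamma and along y0/||y0|| (assumed form of T)."""
    g = gamma_basis(y0)
    norm = math.hypot(y0[0], y0[1])
    u = g[0] * y[0] + g[1] * y[1]
    omega = (y0[0] * y[0] + y0[1] * y[1]) / norm
    return u, omega

def c_gamma(c1, c2, y0):
    """c_Gamma = ||y0|| / (c2'y0) * (c1, Gamma'(-c2)) for p = q = 2."""
    g = gamma_basis(y0)
    scale = math.hypot(y0[0], y0[1]) / (c2[0] * y0[0] + c2[1] * y0[1])
    return [scale * c1[0], scale * c1[1], -scale * (g[0] * c2[0] + g[1] * c2[1])]

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

x0, y0 = (15.0, 15.0), (10.0, 10.0)
c1, c2 = (1.0, 2.0), (3.0, 1.0)        # hypothetical positive coefficient vectors
lam = dot(c1, x0) / dot(c2, y0)         # efficiency score for the linear frontier
cg = c_gamma(c1, c2, y0)
lhs = dot(cg, (x0[0], x0[1], 0.0))      # c_Gamma'(x0, 0)
rhs = math.hypot(y0[0], y0[1]) * lam    # ||y0|| * lambda(x0, y0)

# A point on the frontier c1'x = c2'y should satisfy omega = c_Gamma'(x, u).
y_front = (12.0, 9.0)                   # c2'y_front = 45 = c1'x0
u, omega = canonical_coords(y_front, y0)
bound = dot(cg, (x0[0], x0[1], u))
print(lhs, rhs, omega, bound)
```

Both printed pairs agree up to rounding, which is what the linear form of $\Psi_\Gamma$ predicts under the stated assumption on $T$.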

Let $R_{n,\Gamma}(\theta_\Gamma)$ be the rectangle defined in (23) with $\theta$ and $p$ being replaced
by $\theta_\Gamma$ and $(p+q-1)$. Define $Z'_{n,\Gamma}$ as $Z'_n$ using a random sample from the
uniform distribution on the $(p+q-1)$-dimensional rectangle $R_{n,\Gamma}(\theta_\Gamma)$. By
applying the proof of Theorem 3 with $c_\Gamma$ replacing $c$, we get the following
theorem.

Theorem 5. Assume (A1) and (A2). Assume further that $\Psi = \{(x,y) \in
\mathbb{R}_+^{p+q} : c_1^T x - c_2^T y \ge 0\}$ for some constant vectors $c_1, c_2 > 0$ and that the
density $f_\Gamma$ given at (28) is uniformly continuous on $H_{\varepsilon,\Gamma}(x_0,0)$ for an arbitrarily
small $\varepsilon > 0$. Let $L_{n1}$ and $L'_{n2}$ denote the distributions of $n^{2/(p+q)}[\hat\lambda(x_0,y_0) -
\lambda(x_0,y_0)]$ and $Z'_{n,\Gamma}(0_{p+q-2})/\|y_0\|$, respectively. Then $d(L_{n1}, L'_{n2}) \to 0$ as $n$
tends to infinity.

4. Estimation of $\kappa$ and $\kappa_\Gamma$. We discuss how to estimate $\kappa$ as defined in
(11) for the case where $q = 1$. It is straightforward to extend the methods to
the case where $q > 1$ via the canonical transformation that we introduced in
Section 3.2.

Consider the set $H_\varepsilon(x_0) \subset \Psi$ defined in (13). The projection of this set
on the $x$-space is a conical hull around the vector $x_0$, and for each direction
of the ray $x_0 + Qz_2$, determined by $z_2$, its section on that direction is also a
conical hull of single dimension under the boundary $g$. For each fixed $u > 0$,
let

\[
H_\varepsilon(u;x_0) = \bigl\{(u(x_0 + Qz_2),\, y) : \|z_2\| \le \varepsilon,\ 
g(u(x_0 + Qz_2)) - u\varepsilon \le y \le g(u(x_0 + Qz_2))\bigr\}.
\]

This is a section of $H_\varepsilon(x_0)$ obtained by cutting $H_\varepsilon(x_0)$ perpendicular to $x_0$
at the distance $u\|x_0\|$ from the origin. Its volume in the cutting hyperplane
$u\mathcal{P}(x_0)$, where $\mathcal{P}(x_0)$ is defined between (14) and (15), equals

\[
v_\varepsilon(u) = c_{p-1} u^p \varepsilon^p,
\]

where $c_r$ denotes the volume of the $r$-dimensional unit ball, that is, $c_r =
\pi^{r/2}/\Gamma(r/2+1)$ with $\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt$. Thus, as $\varepsilon \to 0$ we have

\begin{align*}
P[(X,Y) \in H_\varepsilon(x_0)] &= \int_0^\infty \int_{(x,y) \in H_\varepsilon(u;x_0)} f(x,y)\,dx\,dy\,du \\
&= \int_0^\infty f(ux_0,\, u g(x_0))\, v_\varepsilon(u)\,du\,\{1 + o(1)\} \\
&= c_{p-1}\varepsilon^p \int_0^\infty u^p f(ux_0,\, u g(x_0))\,du\,\{1 + o(1)\}.
\end{align*}

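This consideration motivates the following estimator of $\theta$:

\[
\hat\theta = \|x_0\|\, c_{p-1}^{-1}\, n^{-1} \varepsilon^{-p} \sum_{i=1}^n I\bigl((X_i,Y_i) \in \hat{H}_\varepsilon(x_0)\bigr), \tag{29}
\]

where $\hat{H}_\varepsilon(x_0)$ is the sample version of $H_\varepsilon(x_0)$ with $g$ replaced by $\hat{g}$ in its
definition. Note that, for implementing $\hat\theta$, it is convenient to use the fact that

\[
(X_i,Y_i) \in \hat{H}_\varepsilon(x_0) \iff \|Z_{2i}\| \le \varepsilon,\quad \hat{g}^*(Z_{2i}) - \varepsilon \le Y_i' \le \hat{g}^*(Z_{2i}).
\]

It is straightforward to see that $\hat\theta$ is a consistent estimator of $\theta$ under the
conditions of Theorem 2.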
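
The counting estimator (29) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the transformed observations $Z_{2i}$ and the rescaled outputs (called `y_scaled` here) are taken as already computed, and the fitted boundary $\hat g^*$ is passed in as a callable. The function and argument names are hypothetical.

```python
import math

def unit_ball_volume(r):
    """Volume c_r of the r-dimensional unit ball: pi^(r/2) / Gamma(r/2 + 1)."""
    return math.pi ** (r / 2) / math.gamma(r / 2 + 1)

def theta_hat(x0_norm, z2, y_scaled, g_star_hat, eps, p):
    """Counting estimator of theta in the spirit of (29).

    z2         : list of (p-1)-dimensional tuples Z_{2i}
    y_scaled   : list of rescaled outputs compared against g_star_hat(Z_{2i})
    g_star_hat : callable returning the estimated boundary at a point z
    """
    n = len(z2)
    count = 0
    for z, y in zip(z2, y_scaled):
        inside_ball = math.sqrt(sum(c * c for c in z)) <= eps
        if inside_ball:
            g = g_star_hat(z)
            if g - eps <= y <= g:      # within eps below the estimated boundary
                count += 1
    return x0_norm * count / (unit_ball_volume(p - 1) * n * eps ** p)
```

The membership test uses the equivalent condition on $(Z_{2i}, \hat g^*(Z_{2i}))$ stated above, so the set $\hat H_\varepsilon(x_0)$ never has to be constructed explicitly.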

For estimating $\det(A)$, one can apply local polynomial fitting to $\{(Z_{2i},
\hat{g}^*(Z_{2i}))\}$. For a small $\delta > 0$, perform a second-order polynomial regression
on the set of the points

\[
\{(Z_{2i}, \hat{g}^*(Z_{2i})) : \|Z_{2i}\| \le \delta,\ i = 1,2,\ldots,n\} \cup \{(0, \hat{g}^*(0))\},
\]

to get

\[
\hat{g}^*(z) = \hat{g}_0 + \hat{g}_1^T z + z^T \hat{g}_2 z. \tag{30}
\]

Use $\det(\hat{g}_2)$ as an estimator of $\det(A)$. An estimator of $\kappa$ is then defined by
$\hat\kappa = \hat\theta \det(\hat{g}_2)^{-1/2}$.
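
The local quadratic regression and the resulting estimate of $\kappa$ can be sketched as follows, using only the Python standard library. Several choices here are assumptions of the sketch rather than part of the paper: ordinary least squares via the normal equations, the helper names, and the sign convention in `kappa_hat`, which takes the determinant of $-\hat g_2$ on the reading that $A$ is minus half the Hessian (for an even number of $z$-coordinates this coincides with $\det(\hat g_2)$).

```python
import itertools

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] for row in a]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def fit_local_quadratic(points, values, delta, value_at_origin):
    """Fit g0 + g1'z + z'g2 z on points with ||z|| <= delta, plus (0, g*(0))."""
    d = len(points[0])
    pairs = list(itertools.combinations_with_replacement(range(d), 2))
    data = list(zip(points, values)) + [((0.0,) * d, value_at_origin)]
    rows, ys = [], []
    for z, v in data:
        if sum(c * c for c in z) <= delta ** 2:
            rows.append([1.0] + list(z) + [z[a] * z[b] for a, b in pairs])
            ys.append(v)
    k = 1 + d + len(pairs)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    g2 = [[0.0] * d for _ in range(d)]
    for (a, b), coef in zip(pairs, beta[1 + d:]):
        if a == b:
            g2[a][a] = coef
        else:
            g2[a][b] = g2[b][a] = coef / 2.0   # cross term z_a z_b carries 2 * g2[a][b]
    return beta[0], beta[1:1 + d], g2

def kappa_hat(theta, g2):
    """theta * det(-g2)^(-1/2); the sign convention is an assumption of this sketch."""
    det = determinant([[-v for v in row] for row in g2])
    if det <= 0.0:
        raise ValueError("fitted curvature is not negative definite")
    return theta * det ** -0.5
```

The fit needs at least as many points inside the $\delta$-ball as there are coefficients; otherwise the normal equations are singular and the solver fails with a division by zero.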

Using the estimator of $\kappa$ one can obtain a bias-corrected estimator of the
function $g^*$. For this, one generates $Z_n$ repeatedly as described at (12) using
the estimated $\kappa$. Call them $Z_{n,1}, Z_{n,2}, \ldots, Z_{n,B}$. A bias-corrected estimator
is then defined by

\[
\hat{g}^*(0) - n^{-2/(p+1)}\, \bar{Z}_{n,\cdot}(0),
\]

where $\bar{Z}_{n,\cdot}(0) = B^{-1} \sum_{b=1}^B Z_{n,b}(0)$. Also, a $100 \times (1 - \alpha)\%$ confidence interval
is given by

\[
\bigl[\hat{g}^*(0) - n^{-2/(p+1)} Z_{n,(B(1-\alpha/2))}(0),\ \hat{g}^*(0) - n^{-2/(p+1)} Z_{n,(B\alpha/2)}(0)\bigr],
\]

where $Z_{n,(b)}(0)$ are the ordered values of $Z_{n,b}(0)$ such that $Z_{n,(1)}(0) \ge Z_{n,(2)}(0) \ge
\cdots \ge Z_{n,(B)}(0)$.
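
Given simulated draws $Z_{n,1}(0), \ldots, Z_{n,B}(0)$, the bias correction and the interval endpoints amount to a mean and two order statistics. The Python sketch below assumes the draws have already been generated by the construction at (12), which is not reproduced here. The function name and the rounding of the indices $B(1-\alpha/2)$ and $B\alpha/2$ up to integers are choices of this sketch. The two endpoints are returned in increasing order, so the result does not depend on the direction in which the draws are sorted.

```python
import math

def bias_correct(g_hat0, draws, n, p, alpha=0.05):
    """Return (bias-corrected estimate, (lower, upper)) from simulated Z_{n,b}(0)."""
    B = len(draws)
    rate = n ** (-2.0 / (p + 1))                       # n^{-2/(p+1)}
    corrected = g_hat0 - rate * sum(draws) / B

    ordered = sorted(draws, reverse=True)              # Z_(1) >= ... >= Z_(B)
    k_hi = min(B, max(1, math.ceil(B * (1 - alpha / 2))))
    k_lo = min(B, max(1, math.ceil(B * alpha / 2)))
    end_a = g_hat0 - rate * ordered[k_hi - 1]
    end_b = g_hat0 - rate * ordered[k_lo - 1]
    return corrected, (min(end_a, end_b), max(end_a, end_b))
```

For example, `bias_correct(2.0, [-4.0, -1.0, -3.0, -2.0], n=16, p=3, alpha=0.5)` uses the rate $16^{-1/2} = 0.25$ and the first and third ordered draws.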

5. Numerical study. In this section we investigate, by a Monte Carlo
experiment, the behavior of the sampling distribution of the DEA-CRS
estimator in finite samples. More specifically, we examine whether the
bias-corrected estimator suggested above has better properties than the original
DEA-CRS estimator in terms of median squared error.

For our Monte Carlo scenario, we adapted the scenario proposed in Kneip, 
Simar and Wilson (2008) to our setup. The efficient frontier is defined with 
a CRS generalized Cobb-Douglas production function, 

\begin{align*}
Y_{1e} &= X_1^{0.4} X_2^{0.6} \cos\omega, \\
Y_{2e} &= X_1^{0.4} X_2^{0.6} \sin\omega,
\end{align*}

where the random rays are generated through $\omega \sim \mathrm{Uniform}\bigl(\tfrac{1}{9}\tfrac{\pi}{2}, \tfrac{8}{9}\tfrac{\pi}{2}\bigr)$ and the
values of the inputs $X$ by $(X_1, X_2) \sim \mathrm{Uniform}[10, 20]^2$. Then inefficient firms
are generated below the efficient frontier by 

\[
(Y_1, Y_2) = (Y_{1e}, Y_{2e})\, e^{-U/3}, \quad \text{where } U \sim \mathrm{Exp}(1).
\]

So we are in a situation with $p = q = 2$, and we will analyze the estimation
of the efficiency score of the fixed point $x_0 = (15, 15)$, $y_0 = (10, 10)$. The
true value of the parameter to estimate is $\lambda_0 = \lambda(x_0,y_0) = 1.0607$: under
CRS the frontier output at $x_0$ has norm $15$ along any ray, and
$15/\|y_0\| = 15/(10\sqrt{2}) \approx 1.0607$. We analyze the cases $n = 100$ and $n = 400$.
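
The true efficiency score can be checked numerically with a short Python sketch. The exponent `a` of the Cobb-Douglas frontier is an assumption of the sketch. Because the frontier has constant returns to scale, the exponents sum to one and the value at $x_0 = (15, 15)$ does not depend on `a`. The function names are illustrative.

```python
import math

def frontier_outputs(x, omega, a=0.4):
    """Efficient output vector on the ray with angle omega (exponent a is assumed)."""
    scale = x[0] ** a * x[1] ** (1 - a)     # CRS Cobb-Douglas aggregate of the inputs
    return scale * math.cos(omega), scale * math.sin(omega)

def true_efficiency(x0, y0, a=0.4):
    """Radial output efficiency: norm of the frontier point on the ray of y0 over ||y0||."""
    omega0 = math.atan2(y0[1], y0[0])
    y1e, y2e = frontier_outputs(x0, omega0, a)
    return math.hypot(y1e, y2e) / math.hypot(y0[0], y0[1])

print(round(true_efficiency((15, 15), (10, 10)), 4))  # 1.0607
```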

We performed 500 Monte Carlo simulations and computed the squared
errors of the original DEA-CRS estimator and of the bias-corrected estimator.
Table 1 summarizes the results. It gives the ratios of the median of the
squared error of the two estimators,

\[
R_{\varepsilon,\delta} = \frac{\operatorname{med}\{(\tilde\lambda_{0,j} - \lambda_0)^2,\ j = 1,2,\ldots,500\}}{\operatorname{med}\{(\hat\lambda_{0,j} - \lambda_0)^2,\ j = 1,2,\ldots,500\}},
\]

where $\hat\lambda_{0,j}$ and $\tilde\lambda_{0,j}$ denote the original DEA-CRS estimate and the
bias-corrected estimate computed in the $j$th Monte Carlo replication, respectively.
Note that the bias-corrected estimator relies on the values of the
smoothing parameters $(\varepsilon,\delta)$ which appear in the definitions (29) and (30),
respectively.

Table 1
Ratio $R_{\varepsilon,\delta}$ of the median of the squared errors of the bias-corrected estimator over the
median of the squared errors of the original DEA-CRS estimator

              (n = 100)                                    (n = 400)
  ε = δ    Ratio of median of squared errors    ε = δ    Ratio of median of squared errors
  3.50     0.7123                               3.25     0.6500
  3.75     0.6863                               3.50     0.6402
  4.00     0.7264                               3.75     0.6965
  4.25     0.8081                               4.00     0.7026
  4.50     0.8213                               4.25     0.7734
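
The ratio of median squared errors is straightforward to compute from the Monte Carlo output. A minimal Python sketch, with hypothetical argument names, assuming the per-replication estimates are stored in two sequences:

```python
from statistics import median

def median_se_ratio(corrected, original, truth):
    """Median squared error of the bias-corrected estimates over that of the originals."""
    num = median((est - truth) ** 2 for est in corrected)
    den = median((est - truth) ** 2 for est in original)
    return num / den
```

A value below one means that the bias-corrected estimator has the smaller median squared error.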

The table shows that the bias correction works well over the whole range
of smoothing parameters considered: every ratio is below one, ranging from
about 0.64 to 0.82, even though the smoothing parameters were taken to be
equal in the simulation study to save computational cost. For a given value
of the smoothing parameters, the bias-corrected estimator also tends to improve
relative to the original DEA-CRS estimator as the sample size increases.

6. Discussion. In this paper we developed the theoretical properties of 
the DEA estimator defined in (3) in the case where the support $\Psi$ of the
data $(X_i, Y_i)$ satisfies the CRS condition (2). The assumption of CRS may
be tested. In fact, whether the underlying technology exhibits CRS or VRS 
is a crucial question in studying productive efficiency. The question has 
important economic implications. If the technology does not exhibit CRS, 
then some production units may be found to be either too large or too small. 
Using the estimator at (3) in the case where the true technology displays 
nonconstant returns to scale results in statistically inconsistent estimates of 
efficiency and seriously distorts measures of efficiency. 

One way to test CRS against VRS is to use the test statistic defined as

\[
\rho_n = \frac{1}{n} \sum_{i=1}^n \left( \frac{\hat\lambda(X_i,Y_i)}{\hat\lambda_{\mathrm{VRS}}(X_i,Y_i)} - 1 \right),
\]

where $\hat\lambda_{\mathrm{VRS}}$ is a version of $\hat\lambda$ for the case of VRS defined as in (3) but with
the conical hull replaced by the convex hull of $\{(X_i, Y_i)\}_{i=1}^n$. By construction,

\[
\hat\lambda(X_i,Y_i) \ge \hat\lambda_{\mathrm{VRS}}(X_i,Y_i) > 0
\]

so that $\rho_n \ge 0$. A larger value of $\rho_n$ gives stronger evidence against the
null hypothesis of CRS in favor of the alternative hypothesis of VRS. The
test statistic was considered by Simar and Wilson (2002). One may compute
$p$-values or critical values using a bootstrap method. For example, a
subsampling scheme with the subsample size determined by the procedure
described in Politis, Romano and Wolf (2001) might work for this problem.
For testing CRS against nonconstant returns-to-scale, which is broader than
VRS, one may use the estimators analyzed by Hall, Park and Stern (1998)
and Park (2001) instead of $\hat\lambda_{\mathrm{VRS}}$. Theoretical and numerical properties of
these testing procedures are yet to be developed.
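
Once the CRS and VRS efficiency estimates are available for each observation, the test statistic is a simple average. The Python sketch below assumes the statistic is the sample mean of the ratio of the CRS to the VRS estimate minus one, which is the reading consistent with the statistic being nonnegative. Computing the estimates themselves requires solving linear programs and is not shown. The function name is illustrative.

```python
def crs_test_statistic(lambda_crs, lambda_vrs):
    """Mean of lambda_crs / lambda_vrs - 1 over the sample (assumed form of the statistic)."""
    if len(lambda_crs) != len(lambda_vrs) or not lambda_crs:
        raise ValueError("need two non-empty sequences of equal length")
    if any(v <= 0 for v in lambda_vrs):
        raise ValueError("VRS estimates must be positive")
    n = len(lambda_crs)
    return sum(c / v - 1.0 for c, v in zip(lambda_crs, lambda_vrs)) / n
```

Values near zero are consistent with CRS; calibrating how large is "too large" is the role of the bootstrap or subsampling step mentioned above.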